Tong Zhang

Nanjing University of Science and Technology, Nanjing, China

ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Oct 14, 2025

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

Oct 06, 2025

Generalizable Geometric Image Caption Synthesis

Sep 18, 2025

Theoretical Analysis on how Learning Rate Warmup Accelerates Convergence

Sep 09, 2025

CANDY: Benchmarking LLMs' Limitations and Assistive Potential in Chinese Misinformation Fact-Checking

Sep 04, 2025

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Sep 03, 2025

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Aug 27, 2025

AIM: Adaptive Intra-Network Modulation for Balanced Multimodal Learning

Aug 27, 2025

P/D-Device: Disaggregated Large Language Model between Cloud and Devices

Aug 12, 2025

Q-CLIP: Unleashing the Power of Vision-Language Models for Video Quality Assessment through Unified Cross-Modal Adaptation

Aug 08, 2025